Abortion study EU datasets¶
In this study, we will examine abortion data across Europe, with a particular focus on Romania, and aim to answer several questions:¶
How did the abortion rates change over time across EU countries?​
How do EU countries rank in terms of abortion numbers?​
How do demographics like age and region affect abortion rates?​
Do major events or new legislation impact the number of abortions?​
Is there a correlation between the natality and the number of abortions?​
​
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter
# Load the datasets
eu_1 = pd.read_csv('eu_ab_1.csv')
eu_2 = pd.read_csv('eu_ab_2.csv')
eu_pop = pd.read_csv('eu_population_count.csv')
eu_births = pd.read_csv('eu_live_births.csv')
eu_births_by_age = pd.read_csv('eu_births_by_age.csv')
eu_1 = eu_1[(eu_1['geo'] != 'Germany including former GDR') & (eu_1['geo'] != 'Metropolitan France')]
eu_pop = eu_pop[(eu_pop['geo'] != 'Germany including former GDR') & (eu_1['geo'] != 'Metropolitan France')]
print(eu_1.isnull().sum())
print(eu_2.isnull().sum())
DATAFLOW 0 LAST UPDATE 0 freq 0 unit 0 age 0 geo 0 TIME_PERIOD 0 OBS_VALUE 0 OBS_FLAG 7540 CONF_STATUS 7595 dtype: int64 DATAFLOW 0 LAST UPDATE 0 freq 0 indic_de 0 unit 0 geo 0 TIME_PERIOD 0 OBS_VALUE 0 OBS_FLAG 497 CONF_STATUS 502 dtype: int64
C:\Users\Tudor\AppData\Local\Temp\ipykernel_9228\79747375.py:10: UserWarning: Boolean Series key will be reindexed to match DataFrame index. eu_pop = eu_pop[(eu_pop['geo'] != 'Germany including former GDR') & (eu_1['geo'] != 'Metropolitan France')]
eu_1.describe()
| TIME_PERIOD | OBS_VALUE | CONF_STATUS | |
|---|---|---|---|
| count | 7595.000000 | 7.595000e+03 | 0.0 |
| mean | 2004.618565 | 1.589860e+04 | NaN |
| std | 12.723524 | 5.593180e+04 | NaN |
| min | 1960.000000 | 0.000000e+00 | NaN |
| 25% | 1998.000000 | 1.200000e+02 | NaN |
| 50% | 2006.000000 | 1.771000e+03 | NaN |
| 75% | 2014.000000 | 1.046700e+04 | NaN |
| max | 2023.000000 | 1.407042e+06 | NaN |
Q1: How have abortion rates changed over time across EU countries?¶
abortions_by_year = eu_1[eu_1['age'] == 'Total'].groupby('TIME_PERIOD').OBS_VALUE.sum().reset_index()
abortions_by_year.columns = ['Year', 'Total Abortions']
abortions_by_year
| Year | Total Abortions | |
|---|---|---|
| 0 | 1960 | 1394005.0 |
| 1 | 1961 | 448654.0 |
| 2 | 1962 | 364640.0 |
| 3 | 1963 | 361941.0 |
| 4 | 1964 | 385124.0 |
| ... | ... | ... |
| 59 | 2019 | 523981.0 |
| 60 | 2020 | 762567.0 |
| 61 | 2021 | 634463.0 |
| 62 | 2022 | 655626.0 |
| 63 | 2023 | 460367.0 |
64 rows × 2 columns
plt.figure(figsize=(20, 10))
plt.title('Total Abortions by Year in the EU')
plt.xlabel('Year')
plt.ylabel('Total Abortions (in milions)')
plt.plot(abortions_by_year.Year, abortions_by_year['Total Abortions']/ 1_000_000, marker='o')
plt.tight_layout()
plt.grid(True)
years = abortions_by_year['Year']
plt.xticks(np.arange(years.min(), years.max() + 2, 2))
plt.show()
It seemes that somthing might of happened right before 2010 where we had a big rise and than a fall of the abortion numbers per year
Maybe visialising every country in the eu might help us understand this better.
plt.figure(figsize=(20, 20))
index = 1
for location in eu_1.geo.unique():
abortions_per_country_by_year = eu_1[(eu_1['geo'] == location) & (eu_1['age'] == 'Total')].groupby('TIME_PERIOD').OBS_VALUE.sum().reset_index()
abortions_per_country_by_year.columns = ['Year', 'Total Abortions']
plt.subplot(7, 6, index)
index += 1
plt.title(location)
plt.xlabel('Year')
plt.ylabel('Total Abortions (in milions)')
plt.plot(abortions_per_country_by_year.Year, abortions_per_country_by_year['Total Abortions']/ 1_000_000, marker='o')
plt.grid(True)
plt.suptitle('Total Abortions by Year in the EU', fontsize=24, y=1.02)
plt.tight_layout()
plt.show()
Not all contries seem to have a noticible raise in abortions in 2007.
But let's take only Romania:
abortions_romania = eu_1[(eu_1['geo'] == 'Romania') & (eu_1['age'] == 'Total')]
plt.figure(figsize=(20, 10))
plt.title('Total Abortions by Year in Romania')
plt.xlabel('Year')
plt.ylabel('Total Abortions (in milions)')
plt.plot(abortions_romania.TIME_PERIOD, abortions_romania['OBS_VALUE']/ 1_000_000, marker='o')
plt.tight_layout()
plt.grid(True)
years = abortions_romania['TIME_PERIOD']
plt.xticks(np.arange(years.min(), years.max() + 10, 2))
plt.show()
The decline after 1966 happened because abortion was made iligal in Romania,
While the rise in 1990 happened because it was made legal again.
After that it seems it has been on a decline -maybe- due to public awareness around sexual health, contraception, and safe sex practices..
Q2: How do EU countires rank in terms of avort numbers?¶
we'll only take 2023 in consideration
eu_1.groupby('TIME_PERIOD')['geo'].nunique().sort_values(ascending=False)
TIME_PERIOD
2007 28
2017 27
2006 27
2008 27
1992 26
..
1966 7
1961 6
1964 6
1963 6
1962 6
Name: geo, Length: 64, dtype: int64
eu_2.groupby('TIME_PERIOD')['geo'].nunique().sort_values(ascending=False).head(10)
TIME_PERIOD 2017 28 2018 27 2014 26 2015 25 2020 25 2016 24 2019 24 2022 22 2013 21 2023 20 Name: geo, dtype: int64
abortions2023 = eu_1[(eu_1['TIME_PERIOD'] == 2022) & (eu_1['age'] == 'Total')]
abortions2023 = abortions2023.groupby('geo')['OBS_VALUE'].sum().reset_index()
abortions2023.sort_values(by='OBS_VALUE', inplace=True, ascending=True)
plt.figure(figsize=(10, 6))
plt.barh(y=abortions2023.geo, width=abortions2023.OBS_VALUE)
plt.title('Abortion in the EU in 2023')
plt.xlabel('Abortion count')
plt.ylabel('Country')
plt.tight_layout()
Intresting resuls, but I think we'd get a better understanding if instead of only abortions, we'd visualise abortions per population for every country
population2023 = eu_pop[eu_pop['TIME_PERIOD'] == 2022][['geo', 'OBS_VALUE']]
population2023.sort_values(by='OBS_VALUE', inplace=True)
population2023.head(3)
| geo | OBS_VALUE | |
|---|---|---|
| 543 | San Marino | 33698 |
| 333 | Liechtenstein | 39308 |
| 5 | Andorra | 79535 |
First let's visualise the population for each country
plt.figure(figsize=(10, 8))
plt.barh(y=population2023.geo, width=population2023.OBS_VALUE/ 1_000_000)
plt.title('Population in the EU in 2023')
plt.xlabel('Population (in milions)')
plt.ylabel('Country')
plt.tight_layout()
plt.figure(figsize=(18, 8))
population2023 = population2023[(~population2023.geo.str.contains('Euro')) & (population2023.geo != 'Türkiye')]
plt.subplot(1, 2, 1)
plt.barh(abortions2023.geo, abortions2023.OBS_VALUE)
plt.title("Abortions in the EU (2023)")
plt.subplot(1, 2, 2)
plt.barh(population2023.geo, population2023.OBS_VALUE / 1_000_000)
plt.title("Population in the EU (2023)")
plt.tight_layout()
plt.show()
We can see in the two plots above that population seems to be a big factor when talking about abortion, contries with a big population will have a big number of abortions, this seems to be the trend. But there are exceptions, usually estern European cuntries such as Georgia and Moldova.
Maybe looking at the percentages will give us a better idea of what is happening.
Interesting we see that even tho Poland has a bigger Population it has more abortions
abort_per_pop2023 = pd.merge(abortions2023, population2023, on='geo', how='inner')
abort_per_pop2023['OBS_VALUE'] = (abort_per_pop2023['OBS_VALUE_x'] / abort_per_pop2023['OBS_VALUE_y']) * 1_00
abort_per_pop2023.sort_values(by='OBS_VALUE', inplace=True)
abort_per_pop2023
| geo | OBS_VALUE_x | OBS_VALUE_y | OBS_VALUE | |
|---|---|---|---|---|
| 0 | Poland | 160.0 | 36889761 | 0.000434 |
| 1 | Albania | 676.0 | 2793592 | 0.024198 |
| 2 | Lithuania | 2705.0 | 2805998 | 0.096401 |
| 5 | Slovakia | 5891.0 | 5434712 | 0.108396 |
| 13 | Italy | 65449.0 | 59030133 | 0.110874 |
| 8 | Serbia | 8090.0 | 6797105 | 0.119021 |
| 15 | Germany | 103927.0 | 83237124 | 0.124857 |
| 6 | Finland | 7782.0 | 5548241 | 0.140261 |
| 3 | Latvia | 2793.0 | 1875757 | 0.148900 |
| 12 | Romania | 28661.0 | 19042455 | 0.150511 |
| 9 | Portugal | 15870.0 | 10421117 | 0.152287 |
| 10 | Czechia | 16438.0 | 10516707 | 0.156304 |
| 14 | Spain | 98316.0 | 47486843 | 0.207038 |
| 4 | Estonia | 3386.0 | 1331796 | 0.254243 |
| 7 | Moldova | 7896.0 | 2565030 | 0.307833 |
| 16 | France | 233281.0 | 68091703 | 0.342598 |
| 11 | Georgia | 16619.0 | 3688647 | 0.450545 |
plt.figure(figsize=(12, 8))
plt.barh(y=abort_per_pop2023.geo, width=abort_per_pop2023.OBS_VALUE)
plt.title('Abortion as a percent Population in the EU in 2023')
plt.xlabel('Abortion % of pop')
plt.ylabel('Country')
plt.tight_layout()
Weird, maybe when we'll talk about natalities we will understand this graph better.
Q3: How do demographics like age and region affect abortion?¶
Okay now let's see how age influences abortions¶
abortions_by_age = eu_1[eu_1['TIME_PERIOD'] == 2022].groupby('age').OBS_VALUE.sum().reset_index()
abortions_by_age.sort_values(by='OBS_VALUE', inplace=True)
abortions_by_age
| age | OBS_VALUE | |
|---|---|---|
| 0 | 50 years or over | 241.0 |
| 8 | Less than 15 years | 1589.0 |
| 10 | Unknown | 1953.0 |
| 7 | From 45 to 49 years | 4084.0 |
| 6 | From 40 to 44 years | 50376.0 |
| 1 | From 15 to 19 years | 51377.0 |
| 5 | From 35 to 39 years | 115562.0 |
| 2 | From 20 to 24 years | 129504.0 |
| 3 | From 25 to 29 years | 141271.0 |
| 4 | From 30 to 34 years | 148820.0 |
| 9 | Total | 655626.0 |
abortions_by_age.drop(9, inplace=True) # drop toal abortion
abortions_by_age.drop(10, inplace=True) # drop Unknown
plt.figure(figsize=(12, 8))
plt.barh(y=abortions_by_age.age, width=abortions_by_age.OBS_VALUE / 1_000)
plt.title('Abortion by age in the EU')
plt.xlabel('Abortions in thousands')
plt.ylabel('Age of women')
plt.tight_layout()
Let's take only romania
abortions_by_age_Romania2022 = eu_1[(eu_1['TIME_PERIOD'] == 2022) & (eu_1['geo'] == 'Romania')].groupby('age').OBS_VALUE.sum().reset_index()
abortions_by_age_Romania2022.sort_values(by='OBS_VALUE', inplace=True)
# abortions_by_age_Romania.drop(9, inplace=True) # drop toal abortion
# abortions_by_age_Romania.drop(10, inplace=True) # drop Unknown
abortions_by_age_Romania2022 = abortions_by_age_Romania2022[
~abortions_by_age_Romania2022['age'].isin(['Total', 'Unknown'])
]
plt.figure(figsize=(12, 8))
plt.barh(y=abortions_by_age_Romania2022.age, width=abortions_by_age_Romania2022.OBS_VALUE)
plt.title('Abortion by age in Romania in 2022')
plt.xlabel('Abortions')
plt.ylabel('Age of women')
plt.tight_layout()
Let's visualize it in a piechart
import matplotlib.cm as cm
plt.figure(figsize=(10, 8))
sizes = abortions_by_age_Romania2022.OBS_VALUE.values
labels = abortions_by_age_Romania2022.age.values
percentages = sizes / sizes.sum() * 100
def make_label(i):
if i > 3:
return f"{labels[i]}: {percentages[i]:.1f}%"
else:
return ''
pie_labels = [make_label(i) for i in range(len(labels))]
def func(pct):
return f'{pct:.1f}%' if pct > 5 else ''
num_slices = len(abortions_by_age_Romania2022) + 1
cmap = cm.get_cmap('viridis')
colors = cmap(np.linspace(0, 1, num_slices))
wedges, texts, autotexts = plt.pie(
abortions_by_age_Romania2022.OBS_VALUE,
autopct=func,
colors=colors,
startangle=140,
wedgeprops={'edgecolor': 'black'}
)
sorted_indices = np.argsort(-sizes)
legend_labels = [f'{labels[i]}: {percentages[i]:.1f}%' for i in sorted_indices]
sorted_wedges = [wedges[i] for i in sorted_indices]
plt.legend(sorted_wedges, legend_labels, title="Age Groups", loc="center left", bbox_to_anchor=(1, 0, 0.5, 1))
plt.title('Abortion distribution by age group in Romania (2022)')
plt.tight_layout()
plt.show()
C:\Users\Tudor\AppData\Local\Temp\ipykernel_9228\4029927240.py:21: MatplotlibDeprecationWarning: The get_cmap function was deprecated in Matplotlib 3.7 and will be removed in 3.11. Use ``matplotlib.colormaps[name]`` or ``matplotlib.colormaps.get_cmap()`` or ``pyplot.get_cmap()`` instead.
cmap = cm.get_cmap('viridis')
First question that I have, did this change over the time?
plt.figure(figsize=(20, 40))
index = 1
years = eu_1.TIME_PERIOD.unique()
years.sort()
for year in years:
abortions_by_age_Romania = eu_1[(eu_1['TIME_PERIOD'] == year) & (eu_1['geo'] == 'Romania')].groupby('age').OBS_VALUE.sum().reset_index()
abortions_by_age_Romania.sort_values(by='OBS_VALUE', inplace=True)
abortions_by_age_Romania = abortions_by_age_Romania[
~abortions_by_age_Romania['age'].isin(['Total', 'Unknown'])
]
if abortions_by_age_Romania.empty:
continue
plt.subplot(20, 3, index)
index += 1
plt.barh(y=abortions_by_age_Romania.age, width=abortions_by_age_Romania.OBS_VALUE)
plt.title(f'Abortion by age in Romania in {year}')
plt.xlabel('Abortions')
plt.ylabel('Age of women')
plt.tight_layout()
plt.suptitle('Abortions ages each Year in Romania', fontsize=24, y=1.02)
plt.tight_layout()
plt.show()
Alright this does not tell us a lot let's use a heatmap
age_order = [
'Less than 15 years', 'From 15 to 19 years', 'From 20 to 24 years', 'From 25 to 29 years', 'From 30 to 34 years',
'From 35 to 39 years', 'From 40 to 44 years', 'From 45 to 49 years', '50 years and over'
]
import seaborn as sns
import matplotlib.pyplot as plt
romania_ab = eu_1[eu_1['geo'] == 'Romania'].copy()
romania_ab = romania_ab[~romania_ab['age'].isin(['Total', 'Unknown'])]
heatmap_data = romania_ab.pivot_table(index='age', columns='TIME_PERIOD', values='OBS_VALUE', aggfunc='sum')
heatmap_data = heatmap_data.sort_index(axis=1)
heatmap_data = heatmap_data.reindex(age_order[::-1])
plt.figure(figsize=(14, 8))
sns.heatmap(heatmap_data, cmap='Reds', linewidths=0.5)
plt.title('Abortions by Age Group and Year in Romania', fontsize=20)
plt.xlabel('Year')
plt.ylabel('Age Group')
plt.tight_layout()
plt.show()
time_restriction = [1995, 2016]
index = 1
plt.figure(figsize=(25, 10))
for first_year in time_restriction:
plt.subplot(1, 2, index)
index += 1
heatmap_data = romania_ab[romania_ab['TIME_PERIOD'] >= first_year].pivot_table(
index='TIME_PERIOD', columns='age', values='OBS_VALUE', aggfunc='sum'
)
heatmap_data.index = heatmap_data.index.astype(int)
heatmap_data = heatmap_data.sort_index()
heatmap_data = heatmap_data.reindex(columns=age_order, fill_value=0)
heatmap_data = heatmap_data[age_order]
average_abortions = heatmap_data.mean().sort_values(ascending=False)
sorted_age_order = average_abortions.index.tolist()
for age_group in sorted_age_order:
plt.plot(heatmap_data.index, heatmap_data[age_group] / 1_000, label=age_group, marker='o')
plt.title(f'From {first_year} to 2023', fontsize=20)
plt.xlabel('Year')
plt.ylabel('Number of Abortions in thousands')
plt.legend(title='Age Group')
plt.grid(True)
plt.suptitle('Trends in Abortion Numbers by Age Group in Romania (Yearly)', fontsize=16, y=1)
plt.tight_layout()
plt.show()
We can see a decreasing trend in the number of abortions in Romania, by most if not all age groups.
Q4: Do major events or new legislation impact the number of abortions?¶
1.Romania¶
Abortion was legalized in Romania in 1957. However, the situation was complex, with the legalization being followed by a ban on abortion in 1966, according to an article in balcanicaucaso.org. The 1966 ban, Decree 770, was put in place to counter a decline in the birth rate. The ban was later lifted in 1989 after the fall of the communist regime.
plt.figure(figsize=(20, 10))
plt.title('Total Abortions by Year in Romania')
plt.xlabel('Year')
plt.ylabel('Total Abortions (in milions)')
plt.plot(abortions_romania.TIME_PERIOD, abortions_romania['OBS_VALUE']/ 1_000_000, marker='o')
plt.tight_layout()
plt.grid(True)
plt.axvline(x = 1957, color = 'orange', label='abortion made legal in Romania')
plt.axvline(x = 1966, color = 'red', label='abortion made iligal in Romania')
plt.axvline(x = 1989, color = 'green', label='abortion made legal again in Romania')
years = abortions_romania['TIME_PERIOD']
plt.xticks(np.arange(years.min() - 5, years.max() + 10, 2))
plt.legend()
plt.show()
Observations:¶
Although data prior to 1957 is unavailable, we observe a rising trend in abortion rates from 1960 to 1965 — the year before abortion was criminalized.
In 1966, abortion was made illegal in Romania, leading to a sharp and sustained decline in recorded abortion rates. From that point onward, abortions were permitted only under strict medical exceptions. As a result, many women turned to unsafe, illegal procedures, and the ban is estimated to have caused the deaths of approximately 10,000 women. Due to the underground nature of these procedures, it is impossible to determine the exact number of illegal abortions that occurred between 1966 and 1989.
Following the fall of the Communist regime in 1989, abortion was legalized again. This led to a big increase in recorded abortion rates — not only because access was restored, but also because previously undocumented procedures could now be officially registered.
2.Poland¶
Abortion was first legalised in Poland in 1956 and after that it was made iligal in 1993, now let's see poland and also it's neibors arbortion number by years:
import numpy as np
import matplotlib.pyplot as plt
abortions_pl = eu_1[(eu_1['geo'] == 'Poland') & (eu_1['age'] == 'Total')]
abortions_gr = eu_1[(eu_1['geo'] == 'Germany') & (eu_1['age'] == 'Total')]
abortions_lt = eu_1[(eu_1['geo'] == 'Latvia') & (eu_1['age'] == 'Total')]
abortions_cz = eu_1[(eu_1['geo'] == 'Czechia') & (eu_1['age'] == 'Total')]
abortions_sl = eu_1[(eu_1['geo'] == 'Slovakia') & (eu_1['age'] == 'Total')]
abortions_bl = eu_1[(eu_1['geo'] == 'Belarus') & (eu_1['age'] == 'Total')]
plt.figure(figsize=(10, 5))
plt.title('Total Abortions by Year in Poland and Its Neighbors', fontsize=20)
plt.xlabel('Year')
plt.ylabel('Total Abortions (in thousands)')
plt.plot(abortions_pl.TIME_PERIOD, abortions_pl['OBS_VALUE'] / 1_000, marker='o', label='Poland')
plt.plot(abortions_gr.TIME_PERIOD, abortions_gr['OBS_VALUE'] / 1_000, marker='o', label='Germany')
plt.plot(abortions_lt.TIME_PERIOD, abortions_lt['OBS_VALUE'] / 1_000, marker='o', label='Latvia')
plt.plot(abortions_cz.TIME_PERIOD, abortions_cz['OBS_VALUE'] / 1_000, marker='o', label='Czechia')
plt.plot(abortions_sl.TIME_PERIOD, abortions_sl['OBS_VALUE'] / 1_000, marker='o', label='Slovakia')
plt.plot(abortions_bl.TIME_PERIOD, abortions_bl['OBS_VALUE'] / 1_000, marker='o', label='Belarus')
plt.axvline(x = 1956, color = 'gray', label='abortion made ligal in Poland')
plt.axvline(x = 1993, color = 'black', label='abortion made iligal in Poland')
plt.tight_layout()
plt.grid(True)
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))
years = abortions_pl['TIME_PERIOD']
plt.xticks(np.arange(years.min(), years.max() + 1, 5)) # minor fix: +1 not +10
plt.show()
Alright so it does not seem to be any semnificative rise in the abortion rates of its neighbors when the law was passed
My previous observation might be wrong, the anti abortion agenda in Poland gained power in 1989 after the fall of Comunism in Poland
In 1989 after Comunism fell, there were a lot of anti abortion laws implemented
This movment started in Late 1970s – Early 1980, with a Solidarity Movement beeing started in 1980 by the the Catholic Church
import numpy as np
import matplotlib.pyplot as plt
abortions_pl = eu_1[(eu_1['geo'] == 'Poland') & (eu_1['age'] == 'Total')]
abortions_gr = eu_1[(eu_1['geo'] == 'Germany') & (eu_1['age'] == 'Total')]
#abortions_lt = eu_1[(eu_1['geo'] == 'Latvia') & (eu_1['age'] == 'Total')]
abortions_cz = eu_1[(eu_1['geo'] == 'Czechia') & (eu_1['age'] == 'Total')]
abortions_sl = eu_1[(eu_1['geo'] == 'Slovakia') & (eu_1['age'] == 'Total')]
#abortions_bl = eu_1[(eu_1['geo'] == 'Belarus') & (eu_1['age'] == 'Total')]
plt.figure(figsize=(10, 5))
plt.title('Total Abortions by Year in Poland and Its Neighbors', fontsize=20)
plt.xlabel('Year')
plt.ylabel('Total Abortions (in thousands)')
plt.plot(abortions_pl.TIME_PERIOD, abortions_pl['OBS_VALUE'] / 1_000, marker='o', label='Poland')
plt.plot(abortions_gr.TIME_PERIOD, abortions_gr['OBS_VALUE'] / 1_000, marker='o', label='Germany')
#plt.plot(abortions_lt.TIME_PERIOD, abortions_lt['OBS_VALUE'] / 1_000, marker='o', label='Latvia')
plt.plot(abortions_cz.TIME_PERIOD, abortions_cz['OBS_VALUE'] / 1_000, marker='o', label='Czechia')
plt.plot(abortions_sl.TIME_PERIOD, abortions_sl['OBS_VALUE'] / 1_000, marker='o', label='Slovakia')
#plt.plot(abortions_bl.TIME_PERIOD, abortions_bl['OBS_VALUE'] / 1_000, marker='o', label='Belarus')
plt.axvline(x = 1980, color = 'gray', label='anti abortion movement')
plt.axvline(x = 1989, color = 'dimgray', label='anti abortion laws implemented')
plt.axvline(x = 1993, color = 'black', label='Abortion made Iligal')
plt.tight_layout()
plt.grid(True)
plt.legend()
years = abortions_pl['TIME_PERIOD']
plt.xticks(np.arange(years.min(), years.max() + 1, 5)) # minor fix: +1 not +10
plt.show()
My intuition made me think, that people that can't do abortion in Poland anymore will go to the neighboring countries
But we have to take into account that Poland was not part of EU untill 2004 so to leave the country to get an abortion was very hard at that point.
Q5: Is there a corelation between the natality and the number of abortions?¶
eu_births.head()
| DATAFLOW | LAST UPDATE | freq | indic_de | geo | TIME_PERIOD | OBS_VALUE | OBS_FLAG | CONF_STATUS | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | ESTAT:EQ_FER05(1.0) | 27/03/25 23:00:00 | Annual | Live births - females | Andorra | 2018 | 250 | NaN | NaN |
| 1 | ESTAT:EQ_FER05(1.0) | 27/03/25 23:00:00 | Annual | Live births - females | Albania | 2013 | 17089 | NaN | NaN |
| 2 | ESTAT:EQ_FER05(1.0) | 27/03/25 23:00:00 | Annual | Live births - females | Albania | 2014 | 17077 | NaN | NaN |
| 3 | ESTAT:EQ_FER05(1.0) | 27/03/25 23:00:00 | Annual | Live births - females | Albania | 2015 | 15612 | NaN | NaN |
| 4 | ESTAT:EQ_FER05(1.0) | 27/03/25 23:00:00 | Annual | Live births - females | Albania | 2016 | 15332 | NaN | NaN |
Okay let's first take a look at what is the treand of birthrate in Europe
euro_area = eu_births[eu_births['geo'] == 'Euro area – 20 countries (from 2023)']
plt.figure(figsize=(20, 10))
plt.title('Total Births by Year in Euro area')
plt.xlabel('Year')
plt.ylabel('Total Births (in milions)')
plt.plot(euro_area.TIME_PERIOD, euro_area['OBS_VALUE']/ 1_000_000, marker='o')
plt.tight_layout()
plt.grid(True)
years = euro_area['TIME_PERIOD']
plt.xticks(np.arange(years.min(), years.max() + 1, 2))
plt.show()
It seems Births are on a decline in europe, nothing new here
Let's see how each contre is doing:
plt.figure(figsize=(20, 20))
index = 1
for location in eu_births.geo.unique():
births_per_country_by_year = eu_births[eu_births['geo'] == location].groupby('TIME_PERIOD').OBS_VALUE.sum().reset_index()
births_per_country_by_year.columns = ['Year', 'Total Births']
plt.subplot(10, 5, index)
index += 1
plt.title(location)
plt.xlabel('Year')
plt.ylabel('Total Births (in thousands)')
plt.plot(births_per_country_by_year.Year, births_per_country_by_year['Total Births']/ 1_000, marker='o')
plt.grid(True)
plt.suptitle('Total Births by Year in the EU', fontsize=24, y=1.02)
plt.tight_layout()
plt.show()
The decline in birth rates seems to be a consistent pattern throughout Europe.
Let's take a look at Romania
romania_births = eu_births[eu_births['geo'] == 'Romania']
plt.figure(figsize=(10, 5))
plt.title('Total Births by Year in Romania')
plt.xlabel('Year')
plt.ylabel('Total Births (in thousands)')
plt.plot(romania_births.TIME_PERIOD, romania_births['OBS_VALUE']/ 1_000, marker='o')
plt.tight_layout()
plt.grid(True)
years = romania_births['TIME_PERIOD']
plt.xticks(np.arange(years.min(), years.max() + 1, 2))
plt.show()
Let's put it side by side with the abortion (starting in 2013 since the data for births starts in 2013)
romania_births = eu_births[eu_births['geo'] == 'Romania']
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.title('Total Births by Year in Romania')
plt.xlabel('Year')
plt.ylabel('Total Births (in thousands)')
plt.plot(romania_births.TIME_PERIOD, romania_births['OBS_VALUE']/ 1_000, marker='o')
plt.grid(True)
years = romania_births['TIME_PERIOD']
plt.xticks(np.arange(years.min(), years.max() + 1, 2))
abortions_romania_after2013 = abortions_romania[abortions_romania['TIME_PERIOD'] >= 2013]
plt.subplot(1, 2, 2)
plt.title('Total Abortions by Year in Romania')
plt.xlabel('Year')
plt.ylabel('Total Abortions (in thousands)')
plt.plot(abortions_romania_after2013.TIME_PERIOD, abortions_romania_after2013['OBS_VALUE']/ 1_000, marker='o')
plt.grid(True)
plt.xticks(np.arange(years.min(), years.max() + 1, 2))
plt.tight_layout()
plt.show()
We will analyze the share of abortions as a percentage of total pregnancy outcomes by year in Romania:
pregnancy_data = pd.merge(
abortions_romania_after2013[['TIME_PERIOD', 'OBS_VALUE']],
romania_births[['TIME_PERIOD', 'OBS_VALUE']],
on='TIME_PERIOD',
suffixes=('_abortions', '_births')
)
pregnancy_data['abortion_percentage'] = (
pregnancy_data['OBS_VALUE_abortions'] /
(pregnancy_data['OBS_VALUE_abortions'] + pregnancy_data['OBS_VALUE_births'])
) * 100
plt.figure(figsize=(10, 5))
plt.title("Abortion as a procentage of 'known pregnancy outcomes'")
plt.xlabel('Year')
plt.ylabel('Procentage of Abortions %')
plt.plot(pregnancy_data.TIME_PERIOD, pregnancy_data['abortion_percentage'], marker='o')
plt.tight_layout()
plt.grid(True)
years = pregnancy_data['TIME_PERIOD']
plt.xticks(np.arange(years.min(), years.max() + 1, 2))
plt.show()
I want to clarify here: 'known pregnancy outcomes' means to the sum of abortion and birth, it does not acount for miscarriages, stillbirths, and other pregnancy outcomes
We can se that this procentage was on a decline until 2021
plt.figure(figsize=(18, 8))
eu_births2023 = eu_births[(~eu_births.geo.str.contains('Euro')) & (eu_births.geo != 'Türkiye') & (eu_births.TIME_PERIOD == 2023)]
eu_births2023.sort_values(by='OBS_VALUE', inplace=True)
plt.subplot(1, 2, 1)
plt.barh(abortions2023.geo, abortions2023.OBS_VALUE)
plt.title("Abortions in the EU (2023)")
plt.subplot(1, 2, 2)
plt.barh(eu_births2023.geo, eu_births2023.OBS_VALUE / 1_000_000)
plt.title("Births in the EU (2023)")
plt.tight_layout()
plt.show()
C:\Users\Tudor\AppData\Local\Temp\ipykernel_9228\3993873374.py:4: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy eu_births2023.sort_values(by='OBS_VALUE', inplace=True)